Alpha-Divergences in Variational Dropout

Authors

Bogdan Mazoure

Riashat Islam

Abstract

We investigate the use of divergences other than the Kullback-Leibler (KL) divergence in variational inference (VI), building on Variational Dropout [10]. Stochastic gradient variational Bayes (SGVB) [9] is a general framework for estimating the evidence lower bound (ELBO) in variational Bayes. In this work, we extend the SGVB estimator to use alpha-divergences, which are alternatives to VI's KL objective. Gaussian dropout can be seen as a local reparameterization trick applied to the SGVB objective. We extend Variational Dropout to use alpha-divergences for variational inference. Our experiments compare α-divergence variational dropout with standard variational dropout, under both correlated and uncorrelated weight noise. We show that the α-divergence with α → 1 (the KL divergence) remains a good measure for variational inference, despite the effective use of alpha-divergences for dropout VI [14]. Among all values of the parameter α ∈ [0, ∞), α → 1 can yield the lowest training error and optimizes a good ELBO.
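To make the role of α concrete, here is a minimal sketch of how an alpha-divergence behaves as α varies. The abstract does not say which definition of the alpha-divergence is used, so this sketch assumes Rényi's definition, under which α → 1 recovers KL(p || q). It uses the standard closed form for two univariate Gaussians. The function names `kl_gaussian` and `renyi_gaussian` and the example parameters are illustrative and do not come from the paper.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(p || q) for univariate Gaussians p and q."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)


def renyi_gaussian(alpha, mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form Renyi divergence D_alpha(p || q) for univariate Gaussians.

    Assumption: Renyi's definition of the alpha-divergence, not necessarily
    the one used in the paper.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if alpha == 1.0:
        # The alpha -> 1 limit is the KL divergence.
        return kl_gaussian(mu_p, sigma_p, mu_q, sigma_q)
    # Interpolated variance; the divergence is finite only if it is positive.
    var_mix = (1.0 - alpha) * sigma_p ** 2 + alpha * sigma_q ** 2
    if var_mix <= 0:
        raise ValueError("divergence is infinite for this alpha")
    mean_term = alpha * (mu_p - mu_q) ** 2 / (2.0 * var_mix)
    log_term = (0.5 * math.log(var_mix)
                - (1.0 - alpha) * math.log(sigma_p)
                - alpha * math.log(sigma_q)) / (1.0 - alpha)
    return mean_term + log_term


if __name__ == "__main__":
    # Illustrative parameters: p = N(0, 1), q = N(1, 4).
    for a in (0.0, 0.5, 0.999, 1.0, 2.0):
        print(a, renyi_gaussian(a, 0.0, 1.0, 1.0, 2.0))
```

Under this definition the divergence is non-decreasing in α, so the values below α = 1 sit under the KL value and the values above it sit over it. This is only a toy illustration of the family of objectives the abstract refers to, not the paper's training objective.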


Similar resources

Dropout Inference in Bayesian Neural Networks with Alpha-divergences

To obtain uncertainty estimates with real-world Bayesian deep learning models, practical inference approximations are needed. Dropout variational inference (VI), for example, has been used for machine vision and medical applications, but VI can severely underestimate model uncertainty. Alpha-divergences are alternative divergences to VI’s KL objective, which are able to avoid VI’s uncertainty un...


Perturbative Black Box Variational Inference

Black box variational inference (BBVI) with reparameterization gradients triggered the exploration of divergence measures other than the Kullback-Leibler (KL) divergence, such as alpha divergences. In this paper, we view BBVI with generalized divergences as a form of estimating the marginal likelihood via biased importance sampling. The choice of divergence determines a bias-variance trade-off ...


Variational Dropout Sparsifies Deep Neural Networks

We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation to Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator and report first experimental results with individual dropout rates per weight. Interestingly, it leads to extremely sparse sol...


Information Dropout: learning optimal representations through noise

We introduce Information Dropout, a generalization of dropout that is motivated by the Information Bottleneck principle and highlights the way in which injecting noise in the activations can help in learning optimal representations of the data. Information Dropout is rooted in information theoretic principles, it includes as special cases several existing dropout methods, like Gaussian Dropout ...


Differentially Private Variational Dropout

Deep neural networks, with their large number of parameters, are highly flexible learning systems. This flexibility brings with it some serious problems such as overfitting, and regularization is used to address them. A currently popular and effective regularization technique for controlling overfitting is dropout. Often, large data collections required for neural ne...



Journal:

CoRR

Volume abs/1711.04345

Publication date: 2017
